On integrating the lexicon with the language model
Authors
Abstract
The goal of this work was to develop an algorithm for integrating the lexicon with the language model that is efficient in terms of memory requirements, even for large trigram models. Two specialized versions of the transducer composition algorithm were implemented. The first is essentially a composition algorithm that uses a precomputed set of the output labels reachable from each epsilon edge of the lexicon; the second adds an "on the fly" implementation of the pushing of weights and output labels. The proposed algorithms gave very significant memory savings compared with the general determinization algorithm for weighted transducers.
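The precomputed label sets used by the first algorithm can be illustrated with a minimal Python sketch. It is not the authors' implementation: the arc layout `(source, input, output, destination)`, the `EPS` marker, the function name `reachable_outputs`, and the toy lexicon are all assumptions made for illustration. The sketch also assumes that "reachable" means the first non-epsilon output label met on each path leaving the edge.

```python
from collections import defaultdict

EPS = None  # assumed marker for an epsilon (empty) output label


def reachable_outputs(arcs):
    """Map the index of each epsilon-output arc to the set of output
    labels that can be reached from it (hypothetical helper)."""
    out_arcs = defaultdict(list)
    for src, _inp, out, dst in arcs:
        out_arcs[src].append((out, dst))

    def from_state(start):
        # Depth-first walk that follows epsilon-output arcs only and
        # records the first real output label met on each path.
        seen, stack, labels = {start}, [start], set()
        while stack:
            state = stack.pop()
            for out, dst in out_arcs[state]:
                if out is not EPS:
                    labels.add(out)
                elif dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return labels

    return {
        idx: from_state(dst)
        for idx, (_src, _inp, out, dst) in enumerate(arcs)
        if out is EPS
    }


# Toy prefix-tree lexicon (assumed): phone sequences for three words,
# with the word label emitted on the last arc of each pronunciation.
TOY_ARCS = [
    (0, "k", EPS, 1),
    (1, "ae", EPS, 2),
    (2, "t", "cat", 3),
    (2, "n", "can", 3),
    (0, "d", EPS, 4),
    (4, "ao", EPS, 5),
    (5, "g", "dog", 3),
]

label_sets = reachable_outputs(TOY_ARCS)
```

During composition, a set like `label_sets[0]` could be intersected with the words that the current language model state accepts, so that an epsilon edge is expanded only when at least one of its reachable words can match. How the original work stores and uses these sets is not described in the abstract.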
Similar resources
Mental Representation of Cognates/Noncognates in Persian-Speaking EFL Learners
The purpose of this study was to investigate the mental representation of cognate and noncognate translation pairs in languages with different scripts to test the prediction of the dual lexicon model (Gollan, Forster, & Frost, 1997). Two groups of Persian-speaking English language learners were tested on cognate and noncognate translation pairs in Persian-English and English-Persian directions with...
A Stylistic Analysis of Lexicon in Ray Bradbury’s The Martian Chronicles
Ray Bradbury’s The Martian Chronicles is a futuristic, science fiction novel that chronicles the colonization of Mars by humans, projecting the United States’ colonial and immigrant past on to a symbolic future. Bradbury’s use of language is mostly picturesque and sensory. The present paper applies a text-oriented analysis of stylistic elements that construct meaning in the text and evoke the n...
Code-Copying in the Balochi Language of Sistan
This empirical study deals with language contact phenomena in Sistan. Code-copying is viewed as a strategy of linguistic behavior when a dominated language acquires new elements in lexicon, phonology, morphology, syntax, pragmatic organization, etc., which can be interpreted as copies of a dominating language. In this framework Persian is regarded as the model code which provides elements for b...
Models of EFL Learners’ Vocabulary Development: Spreading Activation vs. Hierarchical Network Model
Semantic network approaches view organization or representation of internal lexicon in the form of either spreading or hierarchical system identified, respectively, as Spreading Activation Model (SAM) and Hierarchical Network Model (HNM). However, the validity of either model is amongst the intact issues in the literature which can be studied through basing the instruction compatible wi...
A Supervised Method for Constructing Sentiment Lexicon in Persian Language
Due to the increasing growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...
On the Development of a Model of Cultural Identity and Language Achievement among Iranian Advanced EFL Learners
Culture is an inseparable part of a language. In other words, mastering a language and being able to communicate through it inevitably entails integrating with the culture of the speakers of that language which is the reflection of people's identity. The aim of the present study was designing a model of Iranian cultural identity. Initially, to select a homogeneous sample of learners at the adva...